NSF PAR Search | NSF Public Access Repository


Bounds for the smallest eigenvalue of the NTK for arbitrary spherical data of arbitrary dimension

Karhadkar, Kedar; Murray, Michael; Montufar, Guido (September 2024, 38th Conference on Neural Information Processing Systems (NeurIPS 2024))

Bounds on the smallest eigenvalue of the neural tangent kernel (NTK) are a key ingredient in the analysis of neural network optimization and memorization. However, existing results require distributional assumptions on the data and are limited to a high-dimensional setting, where the input dimension $d_0$ scales at least logarithmically in the number of samples $n$. In this work we remove both of these requirements and instead provide bounds in terms of a measure of distance between data points: notably, these bounds hold with high probability even when $d_0$ is held constant versus $n$. We prove our results through a novel application of the hemisphere transform.
Full Text Available
Bounds for the smallest eigenvalue of the NTK for arbitrary spherical data of arbitrary dimension

Karhadkar, Kedar; Murray, Michael; Montufar, Guido (September 2024, Advances in Neural Information Processing Systems (NeurIPS) 2024)

Full Text Available
Benign overfitting in leaky ReLU networks with moderate input dimension

Karhadkar, Kedar; George, Erin; Murray, Michael; Montufar, Guido; Needell, Deanna (September 2024, 38th Conference on Neural Information Processing Systems (NeurIPS 2024))

The problem of benign overfitting asks whether it is possible for a model to perfectly fit noisy training data and still generalize well. We study benign overfitting in two-layer leaky ReLU networks trained with the hinge loss on a binary classification task. We consider input data that can be decomposed into the sum of a common signal and a random noise component, that lie on subspaces orthogonal to one another. We characterize conditions on the signal-to-noise ratio (SNR) of the model parameters giving rise to benign versus non-benign (or harmful) overfitting: in particular, if the SNR is high then benign overfitting occurs; conversely, if the SNR is low then harmful overfitting occurs. We attribute both benign and non-benign overfitting to an approximate margin maximization property and show that leaky ReLU networks trained on hinge loss with gradient descent (GD) satisfy this property. In contrast to prior work we do not require the training data to be nearly orthogonal. Notably, for input dimension $d$ and training sample size $n$, while results in prior work require $d = \Omega(n^2 \log n)$, here we require only $d = \Omega(n)$.
Full Text Available
Benign overfitting in leaky ReLU networks with moderate input dimension

Karhadkar, Kedar; George, Erin; Murray, Michael; Montufar, Guido; Needell, Deanna (September 2024, Advances in Neural Information Processing Systems (NeurIPS) 2024)

Full Text Available
Mildly Overparameterized ReLU Networks Have a Favorable Loss Landscape

Karhadkar, Kedar; Murray, Michael; Tseran, Hanna; Montufar, Guido (June 2024, Transactions on Machine Learning Research)

We study the loss landscape of both shallow and deep, mildly overparameterized ReLU neural networks on a generic finite input dataset for the squared error loss. We show both by count and volume that most activation patterns correspond to parameter regions with no bad local minima. Furthermore, for one-dimensional input data, we show that most activation regions realizable by the network contain a high-dimensional set of global minima and no bad local minima. We experimentally confirm these results by finding a phase transition from most regions having a full-rank Jacobian to many regions having a rank-deficient Jacobian, depending on the amount of overparameterization.
Full Text Available
Characterizing the Spectrum of the NTK via a Power Series Expansion

Murray, Michael; Jin, Hui; Bowman, Benjamin; Montufar, Guido (February 2023, International Conference on Learning Representations)

Full Text Available
Following Natural Language Instructions for Household Tasks With Landmark Guided Search and Reinforced Pose Adjustment

https://doi.org/10.1109/LRA.2022.3178804

Murray, Michael; Cakmak, Maya (July 2022, IEEE Robotics and Automation Letters)

Full Text Available
